# Set options
knitr::opts_chunk$set(
  echo = TRUE,
  warning = FALSE,
  message = FALSE,
  fig.align = 'center',
  fig.retina = 2
)
rm(list = ls())
library(tinytex)
Warning: package 'tinytex' was built under R version 4.5.2
library(ggplot2)
# library(table1)
library(gt)
library(survival)
library(data.table)
library(randomForest)
library(grf)
library(policytree)
library(DiagrammeR)
# library(grid)
# library(forestploter)
# library(randomizr)
# library(devtools)
# install_github("larry-leon/weightedsurv", force = TRUE)
# install.packages("weightedsurv")
# install_github("larry-leon/forestsearch", force = TRUE)
library(forestsearch)
library(weightedsurv)
# help(forestsearch)
# help(generate_aft_dgm_flex)

# Set theme for plots
theme_set(theme_minimal(base_size = 12))
1 Summary
Reproducing main GBSG analysis
1.1 Data setup
df.analysis <- gbsg
df.analysis <- within(df.analysis, {
  id <- as.numeric(c(1:nrow(df.analysis)))
  # time to months
  time_months <- rfstime / 30.4375
  grade3 <- ifelse(grade == "3", 1, 0)
  treat <- hormon
})
confounders.name <- c("age", "meno", "size", "grade3", "nodes", "pgr", "er")
outcome.name <- c("time_months")
event.name <- c("status")
id.name <- c("id")
treat.name <- c("hormon")
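The divisor 30.4375 used above is the average month length in days (365.25 / 12). The following is a minimal sketch of the same derivations on a toy data frame; the data frame `toy` and its values are hypothetical and are not taken from the GBSG data.

```r
# Average days per month: 365.25 days per year over 12 months
days_per_month <- 365.25 / 12

# Hypothetical toy rows mimicking the columns used in the setup chunk
toy <- data.frame(
  rfstime = c(304.375, 730.5),  # follow-up time in days
  grade   = c(2, 3),
  hormon  = c(0, 1)
)

toy <- within(toy, {
  time_months <- rfstime / days_per_month  # days -> months
  grade3 <- ifelse(grade == 3, 1, 0)       # indicator for grade 3
  treat <- hormon                          # treatment indicator
})

print(toy[, c("rfstime", "time_months", "grade", "grade3", "treat")])
```

Deriving the constant from 365.25 / 12 rather than hard-coding it makes the unit conversion self-documenting; the result is numerically identical to 30.4375.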
# NOTE: In general for GRF trees
# leaf1 --> recommend control
# leaf2 --> recommend treatment

# Tree depth 1
plot(grf_est1$tree1, leaf.labels = c("Control", "Treat"))

# Tree depth 2
plot(grf_est1$tree2, leaf.labels = c("Control", "Treat"))
=== Bootstrap Event Count Summary ===
Total bootstrap iterations: 1000
Event threshold: <12 events
ORIGINAL Subgroup H on BOOTSTRAP samples:
Control arm <12 events: 0 (0.0%)
Treatment arm <12 events: 0 (0.0%)
Either arm <12 events: 0 (0.0%)
ORIGINAL Subgroup Hc on BOOTSTRAP samples:
Control arm <12 events: 0 (0.0%)
Treatment arm <12 events: 0 (0.0%)
Either arm <12 events: 0 (0.0%)
NEW Subgroups found: 867 (86.7%)
NEW Subgroup H* on ORIGINAL data:
Control arm <12 events: 0 (0.0% of successful)
Treatment arm <12 events: 0 (0.0% of successful)
Either arm <12 events: 0 (0.0% of successful)
NEW Subgroup Hc* on ORIGINAL data:
Control arm <12 events: 0 (0.0% of successful)
Treatment arm <12 events: 2 (0.2% of successful)
Either arm <12 events: 2 (0.2% of successful)
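The counts in this summary flag bootstrap iterations in which an arm has fewer than 12 events. A minimal sketch of that kind of tally follows, assuming per-iteration event counts are available as plain vectors; the vectors `events_control` and `events_treatment` and the helper `summarise_flag` are hypothetical and do not reflect the forestsearch internals.

```r
# Hypothetical per-iteration event counts in each arm (toy values)
events_control   <- c(40L, 35L, 28L, 31L, 15L)
events_treatment <- c(22L, 11L, 19L, 9L, 14L)
threshold <- 12L  # matches "Event threshold: <12 events" in the printed summary

# Flag iterations falling below the threshold
low_control   <- events_control < threshold
low_treatment <- events_treatment < threshold
low_either    <- low_control | low_treatment

# Format as "count (percent)" in the style of the printed summary
summarise_flag <- function(flag) {
  sprintf("%d (%.1f%%)", sum(flag), 100 * mean(flag))
}

cat("Control arm <12 events:  ", summarise_flag(low_control), "\n")
cat("Treatment arm <12 events:", summarise_flag(low_treatment), "\n")
cat("Either arm <12 events:   ", summarise_flag(low_either), "\n")
```

In the summary above, the "of successful" percentages use the 867 successful iterations as the denominator, so 2 flagged iterations correspond to roughly 0.2%.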
summaries$diagnostics_table_gt
Bootstrap Diagnostics Summary
Analysis of 1000 bootstrap iterations

Category                    Metric                         Value
--------------------------  -----------------------------  -----------------
Success Rate [1]            Total iterations               1000
                            Successful subgroup ID         867 (86.7%)
                            Failed to find subgroup        133 (13.3%)
                            Success rating                 Good ✓✓
Subgroup H (Questionable)   Unadjusted estimate            1.95 (1.04, 3.67)
                            Bias-corrected estimate        1.58 (0.86, 2.91)
                            Bias correction impact [2]     19.0%
                            CI width change [3]            2.64 -> 2.06
Subgroup Hc (Recommend)     Unadjusted estimate            0.61 (0.47, 0.80)
                            Bias-corrected estimate        0.65 (0.46, 0.93)
                            Bias correction impact [2]     6.1%
                            CI width change [3]            0.33 -> 0.47
Bootstrap Quality: H        Valid iterations               867
                            Mean (SD)                      0.46 (0.40)
                            Coefficient of variation [4]   86.4%
                            Skewness [5]                   -0.00
Bootstrap Quality: Hc       Valid iterations               867
                            Mean (SD)                      -0.43 (0.22)
                            Coefficient of variation [4]   50.4%
                            Skewness [5]                   0.13
Search Performance          Mean max HR found              3.13 (1.25)
                            Mean factors evaluated         42.2
                            Mean combinations tried        944
                            Proportion at maxk             --

[1] Success Rate: Proportion of bootstrap samples where ForestSearch identified a valid subgroup
[2] Bias Correction Impact: Percentage change from unadjusted to bias-corrected estimate
[3] CI Width Change: Confidence interval width before -> after bias correction
[4] Coefficient of Variation: Standard deviation as % of mean (lower is better)
[5] Skewness: Measure of asymmetry (0 = symmetric, |skew| < 1 is generally good)
Interpretation Guide:
✓ Good stability: Subgroup is reliably identified in most bootstrap samples.
⚠ High variability: Bootstrap estimates are imprecise (CV >= 25%). Consider increasing nb_boots or sample size.
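The bias correction impact and CI width change can be approximately reproduced from the rounded Subgroup H values in the table. This sketch assumes the impact is the absolute relative change from the unadjusted to the bias-corrected estimate, which is one reading of the table footnote and is not confirmed against the forestsearch source; the helper names `pct_change` and `ci_width` are hypothetical.

```r
# Values copied from the Subgroup H rows of the diagnostics table
unadj <- c(est = 1.95, lower = 1.04, upper = 3.67)
bc    <- c(est = 1.58, lower = 0.86, upper = 2.91)

# Assumed formula: absolute relative change, in percent
pct_change <- function(before, after) 100 * abs(after - before) / before

# CI width: upper limit minus lower limit
ci_width <- function(x) unname(x["upper"] - x["lower"])

impact <- unname(pct_change(unadj["est"], bc["est"]))
cat(sprintf("Bias correction impact: %.1f%%\n", impact))
cat(sprintf("CI width: %.2f -> %.2f\n", ci_width(unadj), ci_width(bc)))
```

Because the inputs here are already rounded to two decimals, the computed widths (2.63 and 2.05) differ slightly from the reported 2.64 and 2.06, which were presumably computed from unrounded limits.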
summaries$subgroup_summary$original_agreement
Metric Value
<char> <char>
1: Total bootstrap iterations 1000
2: Successful iterations 867
3: Failed iterations (no subgroup) 133
4: Exact match with original 62 (7.2%)
5: Different from original 805 (92.8%)
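The percentages in the agreement table are relative to the successful iterations, not to all 1000. A short check of that arithmetic, using only the counts printed above (the helper `fmt` is hypothetical):

```r
# Counts taken from the agreement table
total       <- 1000L
successful  <- 867L
exact_match <- 62L

failed    <- total - successful        # iterations with no subgroup
different <- successful - exact_match  # subgroup found, but not the original

# Format as "count (percent of n)"
fmt <- function(k, n) sprintf("%d (%.1f%%)", k, 100 * k / n)

cat("Failed iterations:        ", failed, "\n")
cat("Exact match with original:", fmt(exact_match, successful), "\n")
cat("Different from original:  ", fmt(different, successful), "\n")
```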
=== Bootstrap Event Count Summary ===
Total bootstrap iterations: 1000
Event threshold: <12 events
ORIGINAL Subgroup H on BOOTSTRAP samples:
Control arm <12 events: 0 (0.0%)
Treatment arm <12 events: 0 (0.0%)
Either arm <12 events: 0 (0.0%)
ORIGINAL Subgroup Hc on BOOTSTRAP samples:
Control arm <12 events: 0 (0.0%)
Treatment arm <12 events: 0 (0.0%)
Either arm <12 events: 0 (0.0%)
NEW Subgroups found: 477 (47.7%)
NEW Subgroup H* on ORIGINAL data:
Control arm <12 events: 0 (0.0% of successful)
Treatment arm <12 events: 0 (0.0% of successful)
Either arm <12 events: 0 (0.0% of successful)
NEW Subgroup Hc* on ORIGINAL data:
Control arm <12 events: 0 (0.0% of successful)
Treatment arm <12 events: 2 (0.4% of successful)
Either arm <12 events: 2 (0.4% of successful)
summaries$diagnostics_table_gt
Bootstrap Diagnostics Summary
Analysis of 1000 bootstrap iterations

Category                    Metric                         Value
--------------------------  -----------------------------  -----------------
Success Rate [1]            Total iterations               1000
                            Successful subgroup ID         477 (47.7%)
                            Failed to find subgroup        523 (52.3%)
                            Success rating                 Poor ⚠
Subgroup H (Questionable)   Unadjusted estimate            1.95 (1.04, 3.67)
                            Bias-corrected estimate        1.34 (0.81, 2.22)
                            Bias correction impact [2]     31.1%
                            CI width change [3]            2.64 -> 1.41
Subgroup Hc (Recommend)     Unadjusted estimate            0.61 (0.47, 0.80)
                            Bias-corrected estimate        0.64 (0.44, 0.93)
                            Bias correction impact [2]     3.8%
                            CI width change [3]            0.33 -> 0.50
Bootstrap Quality: H        Valid iterations               477
                            Mean (SD)                      0.30 (0.39)
                            Coefficient of variation [4]   131.5%
                            Skewness [5]                   0.01
Bootstrap Quality: Hc       Valid iterations               477
                            Mean (SD)                      -0.45 (0.21)
                            Coefficient of variation [4]   47.3%
                            Skewness [5]                   0.04
Search Performance          Mean max HR found              2.39 (0.69)
                            Mean factors evaluated         43.4
                            Mean combinations tried        43
                            Proportion at maxk             --

[1] Success Rate: Proportion of bootstrap samples where ForestSearch identified a valid subgroup
[2] Bias Correction Impact: Percentage change from unadjusted to bias-corrected estimate
[3] CI Width Change: Confidence interval width before -> after bias correction
[4] Coefficient of Variation: Standard deviation as % of mean (lower is better)
[5] Skewness: Measure of asymmetry (0 = symmetric, |skew| < 1 is generally good)
Interpretation Guide:
⚠ Poor stability: Subgroup is rarely identified. Consider:
Reviewing subgroup criteria (n.min, hr.threshold)
Increasing sample size significantly
Simplifying search (reduce maxk)
Examining if subgroup is real or spurious
⚠ High variability: Bootstrap estimates are imprecise (CV >= 25%). Consider increasing nb_boots or sample size.
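The coefficient of variation and skewness reported under "Bootstrap Quality" can be computed from a vector of bootstrap estimates. The sketch below assumes the CV is the standard deviation divided by the absolute mean (consistent with the positive CV shown for Hc, whose mean is negative) and uses the moment-based skewness; both definitions are assumptions about what the table reports, and `boot_est` is simulated rather than taken from the analysis.

```r
# Assumed definition: SD as a percentage of the absolute mean
cv_pct <- function(x) 100 * sd(x) / abs(mean(x))

# Moment-based sample skewness: third central moment over variance^(3/2)
skewness <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

# Hypothetical bootstrap estimates, simulated for illustration only
set.seed(1)
boot_est <- rnorm(477, mean = 0.30, sd = 0.39)

cat(sprintf("CV: %.1f%%\n", cv_pct(boot_est)))
cat(sprintf("Skewness: %.2f\n", skewness(boot_est)))
```

When the mean of the estimates is close to zero, the CV becomes large even if the spread is moderate, which is worth keeping in mind when reading the CV >= 25% warning.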
summaries$subgroup_summary$original_agreement
Metric Value
<char> <char>
1: Total bootstrap iterations 1000
2: Successful iterations 477
3: Failed iterations (no subgroup) 523
4: Exact match with original 309 (64.8%)
5: Different from original 168 (35.2%)